DRABAL: novel method to mine large high-throughput screening assays using Bayesian active learning

Authors

Othman Soufan

Wail Ba-alawi

Moataz Afeef

Magbubah Essack

Panos Kalnis

Vladimir B. Bajic

Abstract

BACKGROUND Mining high-throughput screening (HTS) assays is key to better decisions in drug repositioning and drug discovery. However, developing suitable and accurate methods for extracting useful information from these assays remains challenging. Virtual screening and the wide variety of databases, methods and solutions proposed to date have not completely overcome these challenges. This study is based on a multi-label classification (MLC) technique for modeling correlations between several HTS assays, meaning that a single prediction represents a subset of assigned correlated labels instead of one label. The devised method thus increases the probability of accurate predictions for compounds that were not tested in particular assays.

RESULTS Here we present DRABAL, a novel MLC solution that incorporates structure learning of a Bayesian network as a step to model dependency between the HTS assays. In this study, DRABAL was used to process more than 1.4 million interactions of over 400,000 compounds and to analyze the existing relationships between five large HTS assays from the PubChem BioAssay Database. Compared to different MLC methods, DRABAL significantly improves the F1 score by about 22%, on average. We further illustrate the utility of DRABAL by screening FDA-approved drugs and reporting those that have a high probability of interacting with several targets, thus enabling drug multi-target repositioning. Specifically, DRABAL suggests the drug thiabendazole as a common activator of the NPC1 and Rab-9A proteins, both of which are designed to identify treatment modalities for Niemann-Pick type C disease.

CONCLUSION We developed a novel MLC solution based on a Bayesian active learning framework to overcome the lack of fully labeled training data and to exploit actual dependencies between the HTS assays. The solution is motivated by the need to model dependencies between existing experimental confirmatory HTS assays and to improve prediction performance. We have conducted extensive experiments over several HTS assays and have shown the advantages of DRABAL. The datasets and programs can be downloaded from https://figshare.com/articles/DRABAL/3309562.
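To make the multi-label setting in the abstract concrete, the following minimal sketch shows how one prediction per compound can be a subset of assay labels and how an F1 score can be computed over such predictions. It is not the DRABAL implementation: the compound and assay names are hypothetical, the data are invented for illustration, and micro-averaging is an assumption, since the abstract does not state which F1 variant was used.

```python
from typing import Dict, Set

# Hypothetical compounds and assays, for illustration only.
# Each compound maps to the SET of assays in which it is active,
# so a single prediction is a subset of labels rather than one label.
true_labels: Dict[str, Set[str]] = {
    "cmpd_1": {"assay_A", "assay_B"},
    "cmpd_2": {"assay_C"},
    "cmpd_3": set(),
}
pred_labels: Dict[str, Set[str]] = {
    "cmpd_1": {"assay_A"},
    "cmpd_2": {"assay_B", "assay_C"},
    "cmpd_3": set(),
}


def micro_f1(truth: Dict[str, Set[str]], pred: Dict[str, Set[str]]) -> float:
    """Micro-averaged F1 over all (compound, assay) label decisions."""
    tp = fp = fn = 0
    for compound, actual in truth.items():
        predicted = pred.get(compound, set())
        tp += len(actual & predicted)   # labels correctly predicted active
        fp += len(predicted - actual)   # labels predicted but not true
        fn += len(actual - predicted)   # true labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(f"micro-F1 = {micro_f1(true_labels, pred_labels):.3f}")
```

A method that models dependencies between assays, as the abstract describes, would aim to raise this kind of score by using what is known about a compound in one assay to inform its predicted label in another.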

Similar resources

Nootropic Medicinal Plants; Evaluating Potent Formulation By Novelestic High throughput Pharmacological Screening (HTPS) Method

The principle of this method was to screen the pharmacological activity of six prepared polyphyto formulations by using a high throughput screening method for their nootropic action. The study was performed in three stages using one, two and three animals, respectively, in a group. Test formulations were given p.o. daily at doses of 50 and 100 mg/kg body weight. The test formulations were compar...

A Droplet Microfluidics Based Platform for Mining Metagenomic Libraries for Natural Compounds

Historically, microbes from the environment have been a reliable source for novel bio-active compounds. Cloning and expression of metagenomic DNA in heterologous strains of bacteria have broadened the range of potential compounds accessible. However, such metagenomic libraries have been under-exploited for applications in mammalian cells because of a lack of integrated methods. We present an inn...

Multiplexed Experiment Design in High-Throughput Screening

An early step in drug discovery involves screening numerous chemical compounds to find molecules that have a specified biological or biochemical effect. This screening process is costly; thus, any modification that makes the screen more efficient would reduce the total cost of drug discovery. Current high-throughput screening methods test each compound from a library of compounds indiv...

Reinforcement Learning Using Gaussian Processes for Discretely Controlled Continuous Processes

In many application domains, such as autonomous avionics, power electronics and process systems engineering, there exist discretely controlled continuous processes (DCCPs), which constitute a special subclass of hybrid dynamical systems. We introduce a novel simulation-based approach for DCCPs optimization under uncertainty using Reinforcement Learning with Gaussian Process models to learn the tra...

Evaluation of different machine learning methods for ligand-based virtual screening

In silico high throughput screening of large compound databases has become an increasingly popular technology for finding valuable drug candidates by applying a wide range of computational methods, such as machine learning [1]. In recent years, many comparative studies of the performance of different machine learning methods in ligand-based virtual screening have been reported [2,3]. In order to extend th...

Journal title:

Volume 8, issue

Pages -

Publication date: 2016
